FOREWORD

This research was performed under Task Area PF55.521.005.01.08 
(The Prediction of Performance). It was carried out in-house to 
investigate questions that arose during the development of experimental
classification tests for Category IVs and Blacks in connection
with Work Unit SD.01 (Development of Screening, Selection and 
Classification Instruments and Procedures for Marginal Personnel). 

Results of the test development research will be published shortly. 

The assistance of the Naval Training Center, San Diego, in 
conducting the study is gratefully acknowledged. Extensive computer 
analyses on the project data sets were carried out by Ms. Nancy 
Neffson. 


J. J. CLARKIN 
Commanding Officer 

SUMMARY


Problem 


Research conducted at NPRDC has focused on developing tests 
that would predict the performance potential of Category IV and 
Black enlisted personnel more accurately than does the operational 
classification battery. In interpreting the results of this 
research, the motivational conditions prevailing during administration
of both the experimental and the operational batteries must be
considered. Previous research results indicate that general orienting
statements made before tests are given can affect the level and
interrelationships of test results. The present study was designed to
identify the effects of different instructional conditions on test 
performance. Its objectives were: (a) to determine whether different 
pretest instructions are associated with different levels of test 
performance of Category IVs, Category I-IIIs, Blacks, and non-Blacks, 
(b) to determine which types of pretest instructions, if any, serve 
to maximize total group and/or subgroup test performance, and (c) to 
estimate the effects of different types of instructions on test 
performance. 


Approach 


A battery consisting of five recently developed experimental 
tests was administered to four different samples of enlisted 
recruits. The tests varied in the extent to which they required 
cognitive, perceptual, and psychomotor abilities. Comparisons of 
subgroup means on the experimental tests were made by analyses of 
variance and covariance. Results were analyzed for Category IV and 
non-IV, and Black and non-Black subgroups. 


Findings and Conclusions 

1. The performance of Category IV personnel on the two most
cognitive of the experimental tests was lowered significantly when
test administration instructions did not contain motivating
statements. No similar effects were found for IVs on the low-cognitive
tests or for non-IVs on any of the tests. Differences in pretest
instructions did not significantly affect the performance of either
the Black or the non-Black subgroup (pages 5-9).

2. More than 16% of the Blacks in the study identified themselves
as Caucasian during the testing situation. This suggests
that, whenever possible, future research concerned with racial bias 
should include provisions for independently estimating the accuracy 
of questionnaire responses (pages 8 and 9). 


Recommendations 


1. Motivating instructions should be provided before tests are 
administered to Navy enlisted personnel to ensure that the test 
performance of Category IVs is consistent with their abilities 
(pages 5-9). 

2. Future research with experimental questionnaires using 
self-reported biographical data should provide for independent 
checks of the accuracy of question responses whenever possible. 
Scales based on self-reported biographical data should not be 
adopted for operational use until such checks have been made (pages 
8 and 9). 

A COMPARISON OF THE INFLUENCE OF INSTRUCTIONAL SET ON TEST
RESULTS FOR MENTAL LEVEL AND RACIAL GROUPS 


BACKGROUND AND PURPOSE 


During recent years, the Center has conducted a program
concerned with developing and validating tests for evaluating
low mental level (Category IV) and/or Black enlisted personnel.
It has been felt that the present classification tests, which
emphasize academic "trainable" types of abilities rather than
practical intelligence, are not appropriate for such personnel.
The success of this program will help to ensure that all personnel
are used to their full effectiveness.

The motivation of the subjects is an important point to be 
considered in the administration of tests. For test results to be 
most valid, subjects must be motivated to do their best. Test 
performance can also be influenced by other conditions. For example, 
it has been found that the performance of Blacks on written tests was 
affected by the race of the examiner (Katz, Roberts, & Robinson, 1965; 
Kennedy & Vega, 1965; Katz, Henchy, & Allen, 1968). Experimentally- 
induced anxiety and terms used to describe a test have also been 
contributing factors (Katz & Greenbaum, 1963; Katz et al., 1965). 

Recruits are presently motivated to do their best on classification
tests because they know that their Navy job assignments
depend on their test scores. Similar incentives were not available 
for the experimental tests, but it was felt that performance would 
be maximized if they were administered early in recruit training, 
when recruits take and are encouraged to do their best on many 
tests. However, observations by the testing staff indicated a 
substantial lack of interest in the experimental tests, which 
contrasted with the general attitude prevailing during administration 
of the operational tests. Thus, the question was raised as to whether 
motivation could be improved by modifying test instructions. 

The present study had the following objectives: 

1. To determine whether the different conditions prevailing
during the administration of tests are associated with different
levels of test performance of Category IVs, Category I-IIIs,
Blacks, and non-Blacks.

2. To determine which types of instructions, if any, serve
to maximize total group and/or subgroup test performance.

3. To estimate the effects of different types of instructions
on test performance.


PROCEDURES 


Data Collection 

The experimental tests used for the study were administered 
at the Naval Training Center (NTC), San Diego, to men in their 
second week of recruit training about 1 week after the regular 
Navy classification tests were given. The administrators were two 
Caucasian Chief Petty Officers. Testing sessions lasted about two
hours. 


Samples 

Four samples, ranging in size from 392 to 518 men (seven to 
nine companies), were used. Each sample provided 5 days of input 
into experimental testing. Directly procured Filipino TNs were 
eliminated from the samples prior to test administration because 
it was felt that, because of language problems, the test performance 
of these men would not be comparable to that of other recruits. 

The independent variables consisted of four different sets of 
orienting instructions. Each set was read to a sample before 
testing was begun. A prime objective of one of the sets was to 
instill a maximum motivational condition (Implied Threat) that 
would be equivalent to that existing for the operational tests. 
Summaries of the instructions and the Ns for the four conditions 
are shown in Table 1. Copies of the instructions are given in 
the Appendix. 


Dependent Variables and Covariates 


Experimental tests. The following five tests served as 
dependent variables. They were chosen as being representative 
of several different types of experimental tests under evaluation. 

(1) Memory for Numbers Test (MEM)--a paced test of memory
span, administered by tape recording. It is similar to the memory
span tests frequently used in IQ tests. The recording presents a
series of 4 to 10 digits, a short period of silence, and a request
that the subject write the numbers in the correct sequence. The
test includes 21 number series, having a total of 146 digits.
Score on MEM consists of the number of digits correctly recorded.

(2) Dominoes (DOM)--an 88-item reasoning test involving
the determination of similarities and differences among pictorial
representations of dominoes.

(3) Word Finding Test (WORD)--a 60-item speed test
involving matching stimulus and response words in different
columns of a page.

(4) Listening Skills Test, Revised (LST)--a 35-item
recorded, paced test involving both assimilation and low-level
reasoning using simple, aurally-presented data.

(5) Maze Test (MAZ)--a group-administered speed test,
patterned after the Porteus Maze Test. Test subjects are given
six maze patterns, each having five entrances and a goal box, and
are asked to identify the entrances leading to the goal box.

Scores for the last four tests are totals of the correct 
answers. All five experimental tests used for the study were 
developed or adapted by this Center to be especially appropriate 
for lower mental level personnel. 


Biographical variables and operational tests. Scores on the
following biographical variables and operational tests were used 
as covariates: 

(6) Socioeconomic Status (SES)--a variable based on
responses to questions concerning home and environmental
characteristics, and parents' educational and occupational status.

(7) Rural-urban Origin (RUR)--a binary variable coded 1 if
the man's family lived on a farm or in a small town during his
teenage years and 0 if otherwise.

(8) Years of Education (YR-ED).

(9) Armed Forces Qualification Test (AFQT)--a measure of
vocabulary, arithmetic reasoning, spatial reasoning, and knowledge
of tools and equipment. The test score is expressed as a percentile
rank.

(10) General Classification Test (GCT)--a measure of ability
to comprehend and define words and to reason verbally.

(11) Arithmetic Reasoning Test (ARI)--a measure of quantitative
aptitude involving mathematical-reasoning and problem solving.

(12) Mechanical Test (MECH)--a measure of basic mechanical
and electrical knowledge and of comprehension of mechanical
principles and relationships.

(13) Clerical Test (CLER)--a measure of perceptual speed and
accuracy that requires checking to determine whether pairs of numbers
are the same or different.

Scores on the Navy operational tests (10 to 13 above) are 
expressed as Navy Standard Scores. These scores have means of 
about 50 and standard deviations of about 10 for a typical full- 
range recruit population. 


Analysis 

Scores on the experimental tests were merged with the biographical
variable and operational test scores to form records for subjects
in all the instructional condition samples. This tape was sorted
into files of Blacks and non-Blacks, and then into files of IVs and 
non-IVs. For each of the four files, the four condition means for 
each of the experimental tests were tested for significant differences 
in a one-way analysis of variance design. If significant differences 
were found, the means differing significantly were identified, using 
Duncan's New Multiple Range Test as adjusted for different Ns. As 
a check on the equality of abilities within the Black and IV
subgroups, analyses of variance and Duncan's tests were computed for
the eight biographical and operational test variables. For Category 
IVs, the group having significant subgroup differences, an analysis 
of covariance was conducted on the five experimental test scores 
using AFQT and GCT as covariates. 
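
The one-way analysis of variance step described above can be sketched in a few lines of Python. This is only an illustration: the scores are made-up values rather than data from the study, the function name `one_way_anova_f` is hypothetical, and the follow-up Duncan's test used in the report is not reproduced here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of score lists, one per instructional
    condition. Unequal group sizes are allowed.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the condition means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own condition mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for four conditions (not study data).
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [2, 3, 4]]
f_ratio, df_b, df_w = one_way_anova_f(scores)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The F ratio would then be compared with a tabled critical value for the given degrees of freedom to decide whether the four condition means differ significantly.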


RESULTS 


Comparisons of the Means of the Experimental Variables 

Subgroup means for experimental test and instructional conditions 
are shown in Table 2, together with the results of comparisons using 
the multiple-range test. For clarity of presentation, the subgroup
means were converted to z means by subtracting the mean of their
row and dividing by the row standard deviation. Thus, the mean
score of a row would be approximately 0, and differential effects
of the instructional conditions on test performance for a subgroup
would be reflected in variations of the means in the rows. Entries
in the last section of Table 2 consist of averages of the 5 z means
for each instructional condition subgroup. Raw score means and
standard deviations for the five experimental tests are shown in 
Table 3. 
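
As a worked check of this conversion, the sketch below recomputes one row of z means from the raw values in Table 3. The helper names `z_means` and `as_table_entry` are hypothetical, and the rounding to two digits is an assumption about how the table entries were prepared; under that assumption the output agrees with the Category IV row for MEM in Table 2.

```python
def z_means(condition_means, subgroup_mean, subgroup_sd):
    """Convert raw condition means to z means for one table row."""
    return [(m - subgroup_mean) / subgroup_sd for m in condition_means]


def as_table_entry(z):
    """Format a z mean as two digits with the decimal point omitted."""
    hundredths = round(z * 100)
    sign = "-" if hundredths < 0 else ""
    return f"{sign}{abs(hundredths):02d}"


# Category IV row for MEM, taken from Table 3
# (conditions in the order NI, REA, EX, THR).
iv_mem_means = [95.50, 100.58, 95.49, 96.18]
iv_mem_overall_mean = 96.81
iv_mem_overall_sd = 18.83

row = z_means(iv_mem_means, iv_mem_overall_mean, iv_mem_overall_sd)
entries = [as_table_entry(z) for z in row]
print(entries)
```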

The differences among the means shown in Table 2 were not 
statistically significant, and were especially small for non-IVs 
and non-Blacks. Differences among IV and Black means were larger 
and appeared to be consistently associated with specific instructional 
conditions. For instance, the z means of Blacks were lower than those
of non-Blacks for all five tests administered under Exhortation
conditions (p < .06 by sign test), and were considerably higher than
z means of non-Blacks for three of the tests administered under
Reassurance instructions. Similarly, the z means of IVs were
higher than those of non-IVs for all five tests administered under
REA (p < .06 by sign test) and were lower than those of the non-IVs
for four of the five tests administered under Implied Threat. These
characteristics of the data were worth exploring further because the
small Ns for IVs and Blacks substantially reduced for these groups
the power of the analyses of variance. Therefore, the possibility
of other systematic differences among the subgroups was investigated
by computing analyses of variance for the comparison variables.
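
The sign tests mentioned above rest on a simple binomial calculation, sketched here for the case of five out of five differences falling in the same direction. The function name `sign_test_p` is hypothetical, and because the report does not say whether its sign tests were one-sided or two-sided, the sketch computes both forms.

```python
from math import comb


def sign_test_p(successes, n, two_sided=False):
    """Probability of at least `successes` out of `n` differences
    falling in the same direction when each direction is equally
    likely (binomial with p = 0.5)."""
    tail = sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n
    return min(1.0, 2 * tail) if two_sided else tail


# Five of five tests showing a difference in the same direction.
one_sided = sign_test_p(5, 5)
two_sided = sign_test_p(5, 5, two_sided=True)
print(one_sided, two_sided)
```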


Further Comparisons 

Two of the 16 analyses of variance conducted with the comparison 
variables were statistically significant. For Blacks, none of the 
analyses was significant, compared with significant differences in 
the AFQT and GCT means of IVs. The means of the IV instructional 
condition subgroups on these variables are shown in Table 4. 

TABLE 2

Subgroup z Means Computed by Rows for Five
Experimental Tests and an Overall Average

                            Instructional Condition(a)
Test       Subgroup          NI     REA      EX     THR

MEM        Non-IV            00     -02      05     -04
           IV               -07      20     -07     -03
           Black            -48      46     -24      21
           Non-Black        -02      01      05     -06

DOM        Non-IV            01     -01      01     -01
           IV               -06      22      00     -15
           Black            -05     -08     -08      21
           Non-Black        -05      04      04     -04

WORD       Non-IV            05     -01     -01      03
           IV                05      03      03     -23
           Black            -20     -13     -13     -06
           Non-Black         04      01      01     -01

LST        Non-IV            06      00      01     -08
           IV                04      25     -01     -20
           Black             27      25     -12     -04
           Non-Black         00      04      03     -09

MAZ        Non-IV            06     -06     -01      02
           IV                21      04     -09     -16
           Black             04      25     -19      15
           Non-Black         06     -04      01     -02

Overall    Non-IV            04     -03      01     -02
Average    IV                03      17     -03     -15
           Black            -08      27     -15      09
           Non-Black         01      00      03     -04

Note.

1. Decimal points are omitted from the z means, where
z mean = (Condition mean - Total subgroup mean) / (Total subgroup s.d.).

2. None of the differences among the means in this table
was statistically significant at p < .05.

3. Sample sizes are: NI, 348, 65, 4, 409; REA, 414, 57,
11, 460; EX, 441, 77, 27, 441; THR, 329, 63, 16, 376.

(a) Abbreviations: NI = No Instructions. REA = Reassurance.
EX = Exhortation. THR = Implied Threat.

TABLE 3

Means and Standard Deviations for Each Experimental Test Arranged
by Instructional Condition and Mental Level/Racial Group

                           Instructional Condition              Overall
Test    Subgroup          NI      REA       EX      THR      Mean     S.D.

MEM     Non-IV        112.39   111.95   113.17   111.71    112.35    16.29
        IV             95.50   100.58    95.49    96.18     96.81    18.83
        Black          96.50   114.64   101.15   109.69    105.74    19.17
        Non-Black     109.96   110.53   111.24   109.23    110.32    17.4b

DOM     Non-IV         42.26    42.04    42.28    42.05     42.16     8.61
        IV             29.19    32.07    29.84    28.29     29.84    10.10
        Black          27.75    27.45    27.44    30.38     28.28    10.12
        Non-Black      40.31    41.20    41.17    40.44     40.82     9.54

WORD    Non-IV         33.85    33.12    33.42    33.73     33.51     6.91
        IV             28.87    29.56    28.76    26.71     28.51     7.88
        Black          27.00    32.27    27.59    28.12     28.59     7.88
        Non-Black      33.21    32.62    33.02    32.81     32.92     7.18

LST     Non-IV         27.73    27.45    27.49    27.10     27.45     4.28
        IV             19.03    20.44    19.19    18.25     19.22     4.91
        Black          20.75    20.64    18.44    18.88     19.14     5.92
        Non-Black      26.51    26.73    26.69    26.09     26.53     5.03

MAZ     Non-IV         23.37    22.71    22.97    23.12     23.02     5.45
        IV             19.34    18.21    17.31    16.80     17.90     6.94
        Black          16.00    17.55    14.26    16.81     15.69     7.51
        Non-Black      22.89    22.27    22.55    22.38     22.52     5.79

TABLE 4

Means of Category IVs on AFQT and GCT

              Instructional Condition
Test         NI      REA       EX      THR

AFQT      23.32    21.98    20.35    19.07
GCT       42.90    44.25    40.83    38.59

Note. For AFQT, the NI mean was significantly
greater than the THR and EX means (p < .01). For
GCT, the REA mean was significantly greater than
the NI, EX, and THR means and the NI mean was
significantly greater than the THR mean (all
differences at p < .01).


Duncan's multiple range tests found the mean of the NI subgroup 
on AFQT to be significantly higher than the AFQT means of the other 
subgroups. On GCT, REA had the highest mean of any subgroup and NI 
the next highest mean. Past research results showed that both AFQT 
and GCT have substantial correlations with some forms of the experimental
tests used in this study (Thomas, 1969). Thus, it was possible
that differences on these variables might be influencing the differences 
observed on the dependent variables. To check this possibility, an 
analysis of covariance was conducted for IVs using AFQT and GCT as 
covariates. 

Results of the analysis confirmed the covariation of GCT and 
AFQT with instructional conditions. When this influence was removed 
from the data (Table 5), the IV means for both MEM and DOM, the 
high-cognitive tests in the experimental battery, were significantly 
lower under NI than under any type of orienting instructions. Thus, 
the motivation and performance of IVs taking high-cognitive tests
appear to be improved by motivating instructions.
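
The adjustment of means in an analysis of covariance can be illustrated with a simplified sketch. It assumes a single covariate, whereas the study used two (AFQT and GCT); the data are made-up values and the function name `adjusted_means` is hypothetical. Each group mean is shifted by the pooled within-group regression slope times the distance of the group's covariate mean from the grand covariate mean.

```python
from statistics import mean


def adjusted_means(groups):
    """Covariance-adjusted group means for one covariate.

    `groups` maps a condition label to a list of (x, y) pairs,
    where x is the covariate and y is the test score.
    """
    sxy = sxx = 0.0
    for pairs in groups.values():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
    slope = sxy / sxx  # pooled within-group regression slope

    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    return {
        label: mean(y for _, y in pairs)
        - slope * (mean(x for x, _ in pairs) - grand_x)
        for label, pairs in groups.items()
    }


# Hypothetical (covariate, score) pairs for two conditions.
data = {
    "A": [(1, 2), (2, 4), (3, 6)],
    "B": [(3, 5), (4, 7), (5, 9)],
}
adjusted = adjusted_means(data)
print(adjusted)
```

In this made-up example, group B has the higher raw mean (7 against 4) but also the higher covariate mean. After adjustment the ordering reverses, which is the kind of shift that removing a covariate's influence can produce.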

TABLE 5

Original and Adjusted Means of Category IVs
on the Five Experimental Tests

                                    Instructional Condition
Experimental                    NI       REA        EX       THR
Test            Mean        (N=65)    (N=57)    (N=77)    (N=63)

1. MEM          Orig.        95.50    100.58     95.49     96.18
                Adjust.      89.35     98.90     97.52     98.52

2. DOM          Orig.        29.19     32.07     29.84     28.29
                Adjust.      26.26     30.79     30.47     31.42

3. WORD         Orig.        28.87     29.56     28.76     26.71
                Adjust.      28.75     28.67     28.92     28.09

4. LST          Orig.        19.03     20.44     19.19     18.25
                Adjust.      18.56     19.51     19.51     19.95

5. MAZ          Orig.        19.34     18.21     17.31     16.80
                Adjust.      19.16     17.45     17.50     17.91

Note. For both MEM and DOM the adjusted means for NI
were lower than those for the three other conditions at
p < .05.


One-sixth (16.7%) of the men coded Black on the Enlisted Master
Tape Record (EMTR) identified themselves as Caucasian on the SES
questionnaire. Since the EMTR is an official record, it seems
reasonable that it is the more accurate of the two sources. Thus,
the Blacks answering the questionnaire incorrectly either did not
pay sufficient attention to this item or falsified their answer.

An error rate of 17% per question would seriously reduce the 
accuracy of a scale that might be formed from biographical information
for use in classification or assignment. Therefore, it would
be desirable to see if this rate of errors is (1) atypical, (2) 
typical only for certain types of questions, (3) typical for the 
bulk of questions for certain groups having particular education or 
racial characteristics, or (4) typical of all questions for the 
generality of enlisted personnel. 

FINDINGS, CONCLUSIONS, AND RECOMMENDATIONS


1. An important finding of the study is that Category IV 
personnel performed significantly better on cognitive tests if 
the motivating conditions were made explicit. However, the test 
performances of non-IV, Black, and non-Black groups were not 
significantly affected by increasing motivational conditions. 

The expected maximum influence of this effect would be 
a slight lowering of the validities of high-cognitive tests for 
Category IVs. No similar effect would be expected for low- 
cognitive tests. Thus, promising low-cognitive tests from the 
present studies may safely be used for operational classification 
decisions. Promising high-cognitive tests found in the ongoing 
research may be used operationally, provided highly motivating 
instructions were used during the experimental administration 
phase of their development. Otherwise, these tests should be 
used with caution and early follow-up research should be conducted 
to check their effectiveness. 

2. About 17% of the Blacks in the study identified themselves
as Whites during the testing situation, a rate of errors which,
if typical, would seriously lower the accuracy of biographical
information scales. Future research with experimental questionnaires
using self-reported biographical information should provide
for independent checks of the accuracy of question responses
whenever possible. No scale based on self-reported biographical
information should be adopted for operational use until the accuracy
of the self-report information has been substantiated.

APPENDIX


INSTRUCTIONS READ FOR EACH OF THE SAMPLES 


1. No Instructions: 

Monitor says nothing before administration of the tests. At 
the conclusion, he says, "The tests which you have finished 
taking are experimental in nature and may be used for future 
classification and assignment in the Navy. Are there any 
questions?" 

2. Reassurance: 

"The tests which you will take today are experimental tests 
which will be used for research purposes only. The results 
will not affect your career in the Navy. However, they may 
be used for future assignment of personnel, so you should do 
your best. If you do not know the answer to a question, 
answer using your best guess." 

3. Exhortation: 

"These are special tests which will be used in the future 
classification of enlisted personnel. Answer each question 
to the best of your ability. If you do not know the answer, 
guess. Often your best hunch will be right." 

4. Implied Threat:

"These are special tests for use in the evaluation and assignment
of enlisted personnel, so you should do your best on
them. If you are not sure of the answer to a question, guess. 
Often your best hunch will be correct. Results of the tests 
will be forwarded to your companies and will be posted in about 
four weeks." 